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Abstract 

Quadratically optimized polynomials are described which are useful in 
multi-bosonic algorithms for Monte Carlo simulations of quantum field the- 
ories with fermions. Algorithms for the computation of the coefficients and 
roots of these polynomials are described and their implementation in the al- 
gebraic manipulation language Maple is discussed. Tests of the evaluation of 
polynomials on dynamical fermion configurations are performed. In a simple 
special case the obtained polynomial approximations are compared to Cheby- 
shev polynomials. 
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1 Introduction 



The numerical simulation of quantum field theories with fermions is an interesting and 
difficult computational problem. The basic difficulty is the necessary calculation of very 
large determinants, the determinants of fermion matrices, which can only be achieved by 
some stochastic procedure with the help of auxiliary bosonic "pseudofermion" fields (for 
general background see, for instance, []]]). 

A promising new approach proposed by Luscher || is based on polynomial approxi- 
mations of some negative powers x~ a of the fermion matrix. In this case the number of 
auxiliary bosonic fields is equal to the order of the polynomial and the resulting multi- 
bosonic action can be treated by simple methods known from bosonic quantum field 
theories without fermions. The performance of such a multi-bo sonic algorithm can be im- 
proved in a two-step polynomial approximation scheme [[J , where the number of bosonic 
fields is equal to the order of a first, relatively low order, polynomial only realizing a mod- 
est approximation of x~ a . The required high precision is achieved in a "noisy" correction 
step as proposed some time ago by Kennedy and Kuti 0] . In the correction step some high 
order polynomial approximations of x~ a /P(x) are used, where P(x) is some polynomial. 
The necessary polynomials can be obtained by minimizing the integral of squared relative 
deviations in an interval containing the eigenvalue spectrum of the fermion matrix. 

The computation of the quadratically optimized polynomial approximations of func- 
tions of general type x~ a /P(x), which are required in the two-step multi-bosonic algo- 
rithms, has been briefly outlined in ref. 0. It can be best done by using the arbitrarily 
high precision facilities of an algebraic manipulation program, as for instance Maple V. 
In the present paper the algorithms for the calculation of the coefficients and roots of 
these polynomials are described in more detail. The optimal ordering of the roots in 
applying the polynomials of the fermion matrix in a product form is also discussed. This 
is necessary in order to keep rounding errors tolerable even for 32 bit arithmetics. 

These kinds of quadratically optimized polynomial approximations belong to the gen- 
eral class of "least-squares" or "Gaussian-" approximations ||. The coefficients of the 
polynomials can be obtained, for instance, by expanding the function to be approximated 
in terms of suitably defined orthogonal polynomials. The expansion in orthogonal polyno- 
mials provides the possibility of a recursive evaluation without knowing the roots, which 
is advantageous from the point of view of rounding errors. In case if the roots of the ap- 
proximating polynomials are also required, the calculation of the polynomial coefficients 
is the smaller part of the calculation. The larger part is to determine the roots with 
sufficient precision and find their optimal ordering. 

Another possibility, besides the quadratic optimization, is to consider "minimax" or 
"infinity-norm" approximation schemes, where the maximum deviation in the region of 
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approximation is minimized ||. For instance, in ref. || this has been used to approximate 
the function x~ l by Chebyshev polynomials. In multi-bosonic algorithms, in general, the 
quadratic optimization seems more appealing, because it provides better approximations 
in the average. In addition, the advantage of the quadratic optimization is its great 
flexibility. For instance, besides the possibility to choose a wide class of functions, it 
also allows for arbitrary complex regions and general weight factors which also take into 
account the average eigenvalue distribution of the fermion matrix. Despite this generality, 
in all cases one can use essentially the same algorithm to determine the polynomials. 
Nevertheless, in the case of the function x _1 the choice between minimax and least-squares 
approximation can only be decided by practical tests and the outcome may also depend 
on the preferences of a particular application. (The same applies to the generalization of 
the Chebyshev approximation to the more general case of x~ a which has been proposed 
recently by Bunk ||.) 

The plan of this paper is as follows: In the next section first the definitions and some 
basic properties of the polynomials necessary in the first step of the two-step multi-bosonic 
algorithm for fermions are collected. In subsection |2.1| the questions of the expansion 
in orthogonal polynomials are discussed. The Maple procedures for the calculation of 
quadratically optimized polynomials and for obtaining the optimal orderings of their roots 
are considered in section [|, together with a few examples. Possible generalizations of these 
results are discussed in section |. These include the polynomials needed in the second 
(correction-) step of the fermion algorithm and also some other cases useful for other 
variants of multi-bosonic algorithms. Some tests are included in section ||, together with 
comparisons to the minimax approximation in the special case of the function The 
last section contains a summary and concluding remarks. 

2 Definitions and some basic properties 

Quadratically optimized approximations of a function f(x) by polynomials P(x) in an 
interval x 6 [e, A] can be defined by minimizing the integral 

J"dx[(f(x)-P(x))w(x)} 2 . (1) 

Here w(x) is an arbitrary weight function. Minimizing the relative deviation means to 
choose w(x) = l/f(x). Here we shall first mainly consider some negative power f(x) = 
x~ a and a positive interval < e < A. (The required generalization to f(x) = x~ a /P(x) 
and more general regions will be discussed in section [|.) In the multi-bosonic algorithm 
for fermions we need a = hNf, with Nf the number of identical fermion flavours. For 
numerical simulations with gluinos, which are Majorana fermions, we have Nf — ~. In 
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QCD Nf = 1,2,3 are relevant, depending on the assumptions made on the values of quark 
masses. 

Therefore, we shall minimize the relative deviation norm 
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(A-e)- 1 ^ dx [1 - x a P{x)f \ . (2) 



The normalization factor in front of the integral is, of course, not necessary. It is intro- 
duced for convenience. Taking the square root is also a matter of convention. 

5 2 is a quadratic form in the coefficients of the polynomial which can be straightfor- 
wardly minimized. Let us denote the polynomial corresponding to the minimum of 5 
by 

n 

P n (a;e, X;x) = ^c„^(a;e, X)x n ~ v . (3) 

v=Q 

Performing the integral in S 2 term by term we obtain 

n n 

5 2 = l-2^c,K (a) + E c Ul Mffu 2 Cu 2 , (4) 

where 

\l+a+n— u A+a+n—u 
Via) A 



(X - e)(l + a + n - u) ' 

\ l+2ct+2n— v\— 1/2 A+2a+2n—v 1 —V2 

U1 ' U2 (A-e)(l + 2a + 2n-i/i-i/ 2 ) ' K} 
Note that the dependence of V and M on n comes only from the dimensions. This can 
be made explicit by introducing in @ the coefficients c nu = c n>n _ v . Then everywhere in 
(||) we replace (n — v) — > v. 

The coefficients of the polynomial corresponding to the minimum of 5 2 , or of 5, are 

n 

c u = c nu (a; e, A) = £ < H ^ a) - (6) 

i/i=0 

The value at the minimum is 

n 

S 2 = S 2 n (a;e,X) = l- £ V^M^V^ . (7) 

1/1,1/2=0 

Scaling the integration variable x by x' = px one obtains the following scaling prop- 
erties of the optimized polynomials: 

6l(a;ep,Xp) = 5 2 (a; e, A) , 

P n (a; ep, Xp; x) = p~ a P n (a; e, A; x/p) , c nu (a; ep, Xp) = p n ~ v ~ a c nu (a\ e, A) . (8) 
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This allows for only considering, for instance, the standard intervals [e/A, 1] and obtain 
other cases by these scaling relations. 

In applications to multi-bosonic algorithms for fermions the decomposition of the 
optimized polynomials as a product of root-factors is needed. This can be written as 

n 

P n (a; e, A; x) = c n0 (a; e, A) JJ[x — r nj (a; e, A)] . (9) 

The scaling properties here are: 

c n0 (a; ep, Ap) = p n ~ a c n0 (a; e, A) , r nj (a; ep, Ap) = p j r nj (a; e, A) . (10) 

The numerical calculation of roots will be discussed in the next section. Since the coeffi- 
cients of the polynomials are real, the roots are either real or occur in complex conjugate 
pairs. In the present case, for n = even the roots always occur in pairs with non-zero 
imaginary parts. For n = odd there is a single real root above the upper limit of the 
interval A, and the other roots are in complex conjugate pairs. 

The above formulae also apply in the limit e — > 0. Other generalizations will be 
discussed in section |. 

From the minimum property of the optimized polynomials and from the positivity of 
the integrand in eq. @ one can derive interesting inequalities for the relative deviation 
norm 5. For < e' < e we obtain 

(l-e)#(a;e,l) < (1 - e')^(a; e', 1) . (11) 

Of course, for m > n we also have S m (a; e, A) < S n (a; e, A). One can also prove 

<£(a;0,l)-e < (1 - e)S 2 n (a; e, 1) , (12) 

and for integer k > 2 and large enough n 

5 2 kn (ka;e,\) < k 2 5%(a; e, A) . (13) 

In general, it does not seem to be possible to derive explicit formulae for 5. In some 
special cases, however, explicit algebraic calculations in Maple give interesting illustrative 
results. For instance, in the important special case a = |, we have 

*I(~;e 2 ,i) = 

(e 10 + 20e 9 + 105e 8 + 320e 7 + 580e 6 + 720e 5 + 580e 4 + 320e 3 + 105e 2 + 20e + l)(e - l) 10 

121(e + l) 10 (e 8 + 24e 6 + 76e 4 + 24e 2 + l)(e 2 + 1) ' 

(14) 

In the limit e —>■ we have, for any a and at least for a wide range of orders n where 
explicit calculation could be performed, 



2.1 Expansion in orthogonal polynomials 

A useful representation of the quadratically optimized polynomials is an expansion in 
suitably defined orthogonal polynomials. If the relative deviation from the function f(x) = 
x~ a /p(x) is minimized, the appropriate integration weight in (JJ) is 



w(x) 2 



x a P(x) 
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(16) 



f{x) 2 

In most cases this is the preferred choice for the weight function. Nevertheless, also other 
cases are of interest, for instance w(x) = 1 and f(x) = x a P(x), as in eqs. (58)-(59) of 
ref. Jfl. Therefore, let us leave for the moment w(x) and f(x) general. 

Let us define the orthogonal polynomials <& v (x) of order v — 0, 1, 2, . . . such that they 
satisfy 

dxw^xf^^x^^x) = 5^q u . (17) 

The arbitrary normalization factor q v will be chosen later on by convenience. The or- 
thogonal polynomials can be used, instead of the simple powers in (^), as a basis for the 
expansion of the optimized polynomials: 

n 

P n (a; e, A; x) = d nu{oc; e, A)* l/ (x) . (18) 

The advantage of this expansion is that now, as one can easily see, the matrix correspond- 
ing to in (H) is diagonal and the coefficients of the optimized polynomial are simply 
given by 

d rw (a\ e, A) = d n (a\ e, A) = — , (19) 

where 

A 

dxw(x) 2 f(x)$ u (x) . (20) 

An important property of this expansion is that, as shown by the notation, the expansion 
coefficients d nv = d v do not depend on the order of the optimized polynomial n. 

The relations in eqs. (p!7|)-(^0D show that fllTf ) is the truncated expansion of the function 
f(x) in terms of the basis defined by & u (x) in the Hilbert space of square integrable 
functions in the interval [e, A] with integration measure dx w(x) 2 . A general consequence 
is that for n — > oo the approximation is exponential, if the function is bounded as, for 
instance, f{x) = x~ a /P(x) for e > 0. 

One can easily show that the minimum of the relative deviation norm in ([?]), for 
simplicity in the case w(x) = 1/ f(x), can now be expressed as 



S 2 n = (A-e)- 1 [ X dx 



w(x) d v $ v {x) 

u=0 
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{X-e)- x f X dx 



n ~ n 

l-w{x)Y,du®M =l-{\-t)- 1 Y J duK- (21) 

v=0 J u=0 

For the determination of the orthogonal polynomials Q v (x) one can use recurrence 
relations ||. Let us define the expansion of $ M (a;) in simple powers by 

%{*) = itfr*r- v ■ (22) 

v=0 

The up to now arbitrary normalization factor q u in fli~7|) can be fixed by requiring 

/ M o=l, (/i = 0,l,2,...) • (23) 

The first two polynomials with \i — 0, 1 are: 

$ (x) = l, $ 1 (x)=x + / 11 = x--, (24) 

with the notation 

/■A 

s M = J dxw(x) 2 x^ . (25) 

The value of the coefficient fu is determined by the requirement of orthogonality of §o(x) 
and $i(x). The normalization factors go and gi are given by 

s 2 

<?o = s , ft = s 2 1 . (26) 

So 

The higher order polynomials $ M (x) for /i = 2, 3, . . . can be obtained from the three- 
term recurrence relation 

$ r +i(aO = (z + AOMz) + 7r-l$r-l(z) • (r = 1, 2, . . .) . (27) 

This relation follows from the fact that $ r+ i(x) — x$ r (a;) is a polynomial of order r, 
which is orthogonal to $ r _ 2 (a:), 3>r- 3(2;), • • •• Multiplying eq. (p7|) by it?(a;) 2( l ) r (x) and 
w(x) 2< l ) r _i(x) and integrating for x G [e, A] we obtain for the recurrence coefficients 

& = -— , 7,-i = , (28) 

where 



Pu = J dxw(x) 2 § u (x) 2 x . (29) 
Combining eqs. (|17D and (p9|) with fl22|) and (25) we also obtain 

P/i ^ , f[ivifiiU2Sl+2fj,—vi—V2 j ^ ] ffwiffwa^^fi— v\— vi ■ (30) 

Vl ,1*2=0 ^l,!/ 2 =0 

From the recurrence relation (P7j), together with eqs. ( pH ) and (PD|) , one can recursively 
determine the coefficients of the orthogonal polynomials f^ u . Calculating the expansion 
coefficients of the quadratically optimized polynomials from eqs. (|T8l) -(|20|) one obtains the 
desired polynomial approximation, without the knowledge of roots. Besides the represen- 
tation by a product of root-factors in (|9D, this offers an alternative recursive way for the 
evaluation of optimized polynomials. 



3 Algorithm in Maple and examples 



The calculation of the coefficients of optimized polynomials and their roots from the 
formulae of the previous section is, in principle, straightforward. The problem is, however, 
that for high orders n = (9(100) the rounding errors in floating point calculations become 
dangerous. This holds for the determination of the inverse matrix M( Q ) _1 in eq. for 
finding the roots of the polynomial required in (||) and for obtaining the coefficients in 
the recursion relation in eq. fl27|). Once the roots are found to sufficient precision, another 
problem is that applying the product representation (||), with x replaced by the (squared) 
fermion matrix, precision losses occur again. This is because the intermediate results 
can be widely different in magnitude for different parts of the eigenvalue spectrum. A 
remedy is to find an optimized ordering of roots for the product of root-factors. All these 
problems can be dealt with by procedures written in an algebraic manipulation program 
as Maple V. 

• Inversion and finding the roots: The inversion of the matrix defined in (|j|) 
can be done by the procedure linalg [inverse] included in the Maple V library. For 
the calculation of roots it is more convenient to write a special procedure rather 
than to use the Maple V library. The Laguerre iteration algorithm |7j is well suited 
and can be easily implemented. In the examples explicitly considered in this paper 
the coefficients of the polynomials are real. This could, in principle, be used to 
accelerate the root-finding algorithm but, for keeping the procedure general, it is 
better not to use the realness of the coefficients. This also allows for checking the 
precision by looking whether the roots are either real or occur in complex conjugate 
pairs, as they should. The necessary number of digits for calculating the roots is 
usually smaller than 2n. It does not depend much on the value of the power a. (In 
the tests mainly a = 1, |, ~ have been considered.) The precision requirements for 
matrix inversion and for finding the roots are roughly similar. 

The calculation on typical workstations for the interesting orders up to, say, n = 100 
can be performed within several hours with storage of several 10 Mbytes. Also higher 
orders with n equal to several hundreds are feasible with reasonable computational 
effort: one has to have in mind that once the optimized polynomials are found they 
will typically be used in fermionic Monte Carlo simulation runs lasting up to several 
months on the largest supercomputers. 

• Optimized ordering of roots: The product representation @, with x replaced by 
the squared fermion matrix X 2 , has to be applied to some vectors during the Monte 
Carlo simulations. This has to be done with care because of possible precision losses. 
In order to allow for a correct evaluation with 64-bit (or even 32-bit) floating point 
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arithmetics, one has to choose the ordering of the factors judiciously. Since in the 
intermediate steps of an iterative evaluation the vector components are collected as 
linear combinations of several terms, precision losses can occur when different parts 
of the eigenvalue spectrum are multiplied by widely different values of the partial 
products. This can be to a large extent avoided by optimizing the ordering of roots. 

It turned out that a good choice is to minimize the maximal ratio of the values 
x a P p (x) for x G [e, A], with P p (x) denoting the value of the partial product under 
consideration. In practice this can be achieved by chosing a discrete number of 
points {xi,x 2 , ■ ■ ■ ,xn} in the interval [e, A] and, in a given step, comparing the 
values of x a P (x)(x — rj) for different choices of the next rj. (P {x) is the optimized 
partial product obtained in the previous step.) The next root rj is chosen such 
that the maximal ratio of the values of x a P (x)(x — rj) for x G {xi,X2, ■ ■ ■ , x^} be 
minimal. The number of equidistant optimization points has been chosen typically 
to be N = 0{n). For a = ~, | and n ~ 100 — 200 even iV ~ n/3 — n/5 turned out 
to be sufficient. 

For some purposes it is better to hold complex conjugate root pairs together in the 
ordering. This is the case, for instance, if the splitting up of the polynomial in 
two complex conjugate halves is used, as discussed in section This ordering is 
somewhat worse for the whole polynomial, but it is necessary if the evaluation of 
the half-polynomials is required. 

An important characteristics of this ordering principle is that the choice of the 
next root depends on the previously chosen ones. Generalizations are possible, for 
instance, to decide on the basis of the maximal ratio in the next two or three or more 
steps, not only in the next step alone. In this way the choice of the next root will 
also depend on the later choices and the optimization develops a global character. 

Recursion coefficients: The recurrence relation (E7I) requires to calculation of the 
coefficients /3 r , j r -i. For this one first determines the values of the basic integrals s M 



in fl25|) and then uses eqs. (^8|) -(|30|). In this case the necessary number of digits is also 
high. For the simple functions x~ a it is somewhat lower than for the determination 
of roots, but in the more general case x~ a /P(x) the requirements are practically 
the same. 

An important question is the behaviour of the relative deviation norm 5 as a function 
of the condition number A/e and of the order n. For fermionic simulations with the 
multi-bosonic algorithms the powers a = 1, |, j are most interesting. I performed most 
of the tests with a — |, which corresponds to the case of Majorana gluinos considered 
in refs. |3|, || [|. Several tests have also been performed with a = 1 and a — |, which 
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Figure 1: The (squared) deviation norm 5 2 of the polynomial approximations of 
x -1 / 4 as function of the order for different values of A/e. The straight lines are fits 
to the last three points. The asterisks show the e/A — > limit given by eq. ( fL5|) . 
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Figure 2: The (squared) deviation norm 5 2 of the polynomial approximations of 
x~ x l 2 as function of the order for different values of A/e. The straight lines are fits 
to the last three points. The asterisks show the e/A — > limit given by eq. ( fL5|) . 
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Figure 3: The (squared) deviation norm 5 2 of the polynomial approximations of 
x~ l as function of the order for different values of A/e. The straight lines are fits to 
the last three points. The asterisks show the e/A — > limit given by eq. fll5l) . 
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are interesting, for instance, in numerical simulations of QCD with 2+1 light dynamical 
quarks. In present day fermion simulations condition numbers up to A/e = O(10 4 ) are in 
most cases sufficient. 

The behaviour of 5 2 in the interesting range of parameters is illustrated by figures [l], [§ 
and [| As is shown by the figures, the large-n asymptotic behaviour of S is well represented 
by the expected exponential decrease, already in the presented range or orders. In all three 
cases the observed behaviour is consistent with 



5 n (a;e/X, 1) ~ exp l-C a n^fe/\\ , (31) 



where the constants C a are approximately given by C\u ~ 2.3, Ci/ 2 — 2.2 and C\ ~ 2.0, 
respectively. In fact, the fitted constants in figs. [I] and |] are close to these values for 
A/e = 10 2 and 10 3 , but still about 10-20% higher for A/e = 10 4 . In fig. || the asymptotic 
behaviour is a reasonably good representation in the whole range for all three values of 
A/e. 

Figures to || also demonstrate that for relatively low orders 5 n (a; e/A, 1) evolve close 
to the upper limit 5„(a;0,l) given in eq. (|T5|). This is required by the inequality in 
eq. (|12"D, as long as 

ZTT » ■ ( 32 ) 

n + 1 + a v 

A rough estimate for the lowest n where the relatively fast exponential decrease in fl3~T|) 
sets in is given by the value of n where the two sides of (j32|) become equal. 



4 Generalizations 

The great advantage of the quadratic polynomial approximation scheme defined in sec- 
tion |2| is its flexibility towards other functions and/or other regions of approximation. For 
instance, in the two-step multi-bosonic algorithm for fermions || three different polyno- 
mial approximations are needed: the one considered up to now with a relatively low order 
n, denoted by P(x), and two others with higher orders, realizing a better approximation. 
One of them has to approximate the function x~ a /P(x) therefore the relative deviation 
norm of P(x) in eq. (0) is replaced by 

5= ^{X-e)- 1 j\x[l-x a P{x)P{x)fy . (33) 

The second higher order polynomial P(x) is similarly defined. For a possible definition 
see ref. JJ, or alternatively, take eq. ( |33"D with a = and P — > P, P->P |. 

In order to find the minimum of 5 in eq. (|33D one can proceed similarly to sections ^| 
and [3]. Denoting the coefficients of the polynomial P(x) by c p , the coefficients of the 
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quadratic form in (f|) are now replaced by 

^l+a+n—u+n—u ^1+a+n—u+n—u 

9=0 



K {a) = (A - E 



Cp : z z 

1 + a + n — is + n — v 



n \l+2a+2n—ui—V2+2n—ui—U2 A+2a+2n— v\— V2 J r2n— v\— V2 



Another kind of generalization, besides (p3|), is to consider different integration regions. 
For instance, the integration region can be split in several disjoint intervals. For integer 
power a the intervals can also be extended to negative x. As an interesting example, let us 
mention the case of integer a with two intervals, one positive and another negative, lying 
symmetrically around zero. Such approximations can be useful for considering Hermitean 
fermion matrices with spectra symmetric around zero. 

Other optimized polynomial approximations are useful in different variants of multi- 
bosonic algorithms proposed by de Forcrand et al. [K|. In particular, in the non- 
Hermitean algorithms optimized approximations in the complex plane are required. In 
this case the interesting powers are twice as large as in the above Hermitean case, namely 
a = | for Majorana fermions and, in general, a = Nf for Nf equal-mass flavours. The 
relative deviation norm corresponding to eq. (0) is now defined by 

5 _j f n dxdy\l-(x + iy) a P(x + iy)\ 2 \ h 
\ In dxd V J 

where TZ is a region in the complex plane containing the eigenvalue spectrum of the non- 
Hermitean fermion matrix. The coefficients of the polynomial P(x + iy) are now, in 
general, complex. (Nevertheless, in special cases as in the explicit examples considered 
below, they can still be real.) The quadratic form for S 2 is also generally complex: 

n n 

E [ c l v i a) + ^Cu] + E ^Mjg^, (36) 



where 



v (a) = In dxdyjx - iy) a+n v 
In dx d y 



, a) J n dxdyjx - iy^-^jx + iyT+n-vi 

1,1/2 ~ J n dxdy ■ (37) 

For the coefficients of the optimized polynomial eq. (|6|) still holds. 

In principle the shape of the complex region TZ can be quite arbitrary. However, for 
applications in multi-bosonic algorithms high polynomial orders are necessary, therefore 
it is advantageous to choose a region where the necessary integrals 

l klM = [ dxdy{x-iy) kl {x + iy) k2 , (38) 
Jn 
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with positive kx,k 2 , can be evaluated explicitly and the results are relatively simple. This 
is because the Maple procedures described in the previous section require high precision. 

In case of integer power a rectangular or elliptical shapes are well suited. Let us 
first consider a rectangle {x G [e, A]; y G [—6, 5}} which is symmetric around the real axis. 
This is appropriate for fermion simulation algorithms because usually, if A is an eigenvalue 
of the fermion matrix then its complex conjugate A* also is. The integral in (|38|), with 
positive integer k\, k 2 , is in this case the following: 

r V* V r-i^(«x-«) KiKfci-"i)!"2!(fc2-« 2 )! 

x fei,fc 2 - z^y i ) Liu 



Kl=0 K2=0 



kx\k 2 



(A 



l+k 1 -K 1 +k 2 — k 2 



(S 



1 + K1+K2 



-6) 



1 + K1+K2' 



(39) 



(1 + kx — «i + k 2 — k 2 ) (1 + «1 + k 2 ) 

In case of an ellipse which is symmetric around the real axis with centre at (x = 
| (A + e),y = 0) and half-axes of lengths ~(A — e) and 6 in the direction of the x- and 
y-axis, respectively, it is convenient to introduce the new integration variables 

2x — A — e y 



A-e 

The integral is then reduced to the unit circle C around the origin 

k 



(40) 



I, 



k\,k 2 



f c d£dri(±(\ + e) + ±(\- e)£ - ify) ' (I (A + e) + \{X - e)£ + l 5 V 
and a term-by-term evaluation gives: 



fc 2 



(41) 



1 fcl 

-(X — e)8 $ii+i2+i3,ki X] ^Ji+ia+is.fca 

U,i2,J3=0 ilJ2,i3=0 



(l + (_!)*»+.») ^ + (_1)W+J3^ (_ 1 )|fe-ja) »1 ! ^!»3'J1'J2'J3 



r 


K 1 + ^+j 2 )_ 


r 


|(1 + is + is)" 


r 


l + |(«2+j2 + i3+i3) 





A + e 



«i+ii 



^3 +.73 



(42) 



In the complex plane, up to now, we only considered a = integer powers. For half- 
integer 



a = A + 



(43) 



the evaluation of the integrals on rectangles or ellipses becomes already cumbersome. A 
simple possibility is to go to the square-root complex plane (£, 77) by the relations 



x + iy = (£ + ir/Y 



x = e 



v 2 , 



(44) 
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This transformation brings eq. (|3^) to the form 
o = < 



(45) 



Here 1Z is a region in the right half of the square-root complex plane such that its image 
under the mapping defined by (f44|) covers the eigenvalue spectrum of the fermion matrix 
in the original (x, y)-plane. After this transformation one can take, for instance, rectangles 
or ellipses in the (^,7])-plane and apply formulae as fl59p or fl4"2|), respectively 

Further generalizations are also possible: One can take, for instance, more complicated 
regions with boundaries given by polygons or ring shapes etc. Another possibility is to 
change the weight function w(x) in the definition of the deviation norm (|I])-(0). Taking 
e. g. w(x) = 1 means to consider absolute deviations instead of relative ones. In multi- 
bosonic algorithms a special choice of the weight function can help to improve the quality 
of the approximations, for instance, by choosing w(x) to be proportional to the average 
density of eigenvalues in the interval [e, A]. 



5 Evaluating the polynomials: tests and comparisons 

The very high precision needed for obtaining the quadratically optimized polynomials does 
not mean that there is a similar difficulty in evaluating the polynomials. An extensive 
testing in the running project with dynamical gluinos in an SU(2) gauge theory showed 
II 3 that, in fact, there is no practical problem with the evaluation. Here only a small part 
of the performed tests are shown as represenative examples. The choice of parameters and 
lattice configurations is motivated by a work in progress 0. For tests with the fermion 
matrix the (squared) even-odd preconditioned Hermitean fermion matrix is considered. 

As a simple case, let us begin with an example for the evaluation of the polynomials 
on the real axis. We consider the polynomial approximation of order n = 72 for 1/ P^{x) 
in the interval [e, A] = [0.0002, 3.5], where Pm{%) is a 48-th order optimized approximation 
of x~ l 1 4 / Pis(x) and P\§(x) is a 16-th order optimized approximation of x -1 / 4 in the same 
interval. The dependence of the result on the number of digits used is easily controlled 
by changing the Digits parameter in Maple. In figure ^ the ratios of the results are 
shown for Digits = 6 divided by Digits = 12, which display the evaluation errors for 



Digits = 6. In the left part of the figure the recursive evaluation discussed in section \Ll 
is used, in the right part the root-product form in (f|) with optimized ordering of roots, 
as described in section [| As one can see, both evaluations give small errors of the order 
(9(10~ 5 )-(!}(10~ 4 ). The errors in case of the root-product form are somewhat larger and 
not completely uniform in the interval. 
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Figure 4: The rounding errors with Digits = 6 for the evaluation of a quadratically 
optimized polynomial, as specified in the text, with recurrence (left) and with root- 
factors in optimized order (right). (Note the small scale difference in the two halves.) 

For testing the evaluation of polynomials P(X 2 ) of the square of the even-odd precon- 
ditioned Hermitean fermion matrix X randomly chosen configurations were taken from 
updating series at j3 — 2.3: at K — 0.185 on 8 3 ■ 16 lattice and at K — 0.190 on 6 3 ■ 12 lat- 
tice. Approximations were considered for x _1 / 4 with order m, for x~ 1 ^/P ril with order n 2 
and for 1/P n2 with order n^. In the two cases the orders were (jii = 14, n 2 = 40, = 96) 
and (jii = 16, n 2 = 60, n 3 = 96), respectively. The approximations were optimized in 
the intervals [e, A] = [0.001,3.4] and [0.0002,3.5], respectively. The tests were carried out 
on the CrayT90 of HLRZ at Jiilich with 64-bit arithmetics. For estimating the round- 
ing errors the polynomials were evaluated on random Gaussian starting vectors of unit 
length in three different ways and the components of the differences of result vectors were 
considered. 

The first way of evaluation was the root-product form in @. For defining the second 
evaluation, the pairs of complex conjugate roots were decomposed, as usual, according to 

(X 2 - r)(X 2 - r*) = (X - Pl )(X - p 2 )(X - p\){X - p*) (46) 

and the half-polynomial 

n 

P^{X) = ^\{{X-p nj ) (47) 

3=1 

was taken, which gives 

Pn{X 2 ) = P^ /2 (XyP^(X) . (48) 
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In this second case, in the optimized ordering of roots the complex conjugate pairs were 
kept together, as mentioned in section H. The third way of evaluation was the recursive 



one discussed in section gTT[ The performed arithmetics are in these three cases completely 
different, hence the rounding errors are also different. The obtained averages for the real 
and imaginary parts of the components of difference vectors, which would be zero for 
infinite precision, are shown in table [1|. Since the rounding errors are independent for 
the pairs of evaluations considered, the individual roundind errors of an evaluation are 
smaller than those of the corresponding pairs. The numbers in the table show that 

o-i > a 2 > 03 . (49) 

In general, all the three values are comfortably small. In fact, they are of the same order as 
the rounding errors in global sums of order 0(1) quantities over the lattice. The standard 
deviations given in the table, which were obtained by repeating the calculation 100 times 
with different random starting vectors, show that the rounding errors depend little on the 
starting vector. Similarly, the particular gauge configuration chosen from the updating 
did not matter either. The dependence on the polynomial order and type is illustrated 
on the same configurations by table |[ 



Table 1: The average value of the real and imaginary parts of components of differ- 
ence vectors 0"i2, o"i 3 , 0-23 for three different evaluations of the second polynomials 
P n2 , as defined in the text. The values in the table are given in units of 10~ 12 . 
The digits in parentheses give standard deviations as observed for different starting 
vectors. 



lattice 


0"12 


<7l3 


C"23 


8 3 ■ 16 


0.472(8) 


0.434(9) 


0.1569(7) 


6 3 ■ 12 


0.91(7) 


0.88(8) 


0.207(1) 



Table 2: The same as table for the three different polynomials considered. The 
case of root-product minus recursive evaluation is shown. 



lattice 


0"13 {P ni ) 


<7l3(-Pn 2 ) 


0-13(^3) 


8 3 • 16 


0.0402(3) 


0.434(9) 


0.1243(4) 


6 3 • 12 


0.0536(7) 


0.88(8) 


0.1473(9) 
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1/x, n=70, [epsilon, lambda] = [0 . 01, 100.0] , n_s=34 1/x, n=70, [ epsilon, lambda] =[ . 1 , 1 00 . ] , n_s=34 




Figure 5: The evolution of the partial products towards the function 1/x to be 
approximated in the interval [e, A] = [0.01,100.0]. The order of the polynomial is 
n = 70. The optimization of the root ordering is performed on n s = 34 points in 
the interval. The curves show the natural logarithms log |xP p (x)| for the partial 
products P p (x) with p — 1, .., 10 (left) and p = 61, .., 70 (right). 



Figure f| and tables [I] and || already show that in these typical cases the achieved 
optimization of root orderings is completely satisfactory. In any case, the best is to use 
the recurrence relations and not the root-factor products, if possible. The quality of root 
orderings can also be demonstrated by showing the x-dependence of the partial products of 
root-factors P p (x) for p = 1, 2, . . . , n. This is shown in a case by figure [|. Remember that 
during the optimization the depicted curves were tried to be kept as close to horizontal 
lines as possible. 

The third polynomial considered in the above examples (and in the two-step multi- 
bosonic algorithm) is the approximate inverse of the second one. In this case the indepen- 
dence of the polynomial coefficients on the final maximal order can be used to monitor 
the length of the residue vector of the inversion r$: 

r = \v - P n2 {X 2 )P n ^ (X 2 )v \ (50) 

and stop the iteration if some required precision is achieved. Here vq is the starting 
vector and P n > 3 is a polynomial with smaller order than P„ 3 . In these tests, besides 
random Gaussian vectors of unit length also local starting vectors with a single non-zero 
component were considered from a randomly chosen site. The interval of optimization 
was also changed. In figure |6|, which was obtained on an 8 3 • 16 configration at (/3 = 
2.3, K = 0.185), the order of the second polynomial is n 2 = 40. As before, the curves 
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Figure 6: The length-square of the residue vector as a function of the polynomial 
order n = n' 3 . The inverse of an optimized polynomial of order ni = 40 is calculated. 
The polynomials with order n are optimized on the intervals [e, A] = [0.001,3.4] or 
[0.003, 3.4] and the starting vectors are random Gaussian (r) or local from a random 
starting point (/), as shown at the curves. 
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depend only very little on the starting vector with a given type. Actually, the curves are 
superpositions for three different random starting vectors of the same type. 

This type of inversion for polynomials with values close to one compares rather 
favourably with more conventional ones as, for instance, conjugate gradient iteration. 
(Remember that the values of P n2 are close to one because it is an approximation to 
x~ l 1 4 / P ni (x) .) In fact, in order to obtain the same precision in r , one needs 4 CG itera- 
tion steps corresponding to Wn-i = 440 matrix multiplications with X 2 , as compared to 
n = 90 in the figure. (Here the calculation of the residues from ( |o!l| ) is not counted, as it 
is not necessary for the inversion.) 

In the special case of the function x~ l the quadratically optimized polynomials may be 
compared to the Chebyshev polynomials used in 0, which minimize the maximal relative 
deviation. The asymptotic behaviour of 5 for the Chebyshev polynomials is also given 



by a formula as (|3TD with C\ = 2. However, the constants in front of the exponent are 
quite different. It turns out that at low orders the quadratically optimized polynomials 
are much better almost everywhere in the interval of approximation [e, A]. There is only a 
very small piece near e where the relative deviation of the Chebyshev polynomial of same 
order is a little bit better. This is illustrated by figure [j]. The advantage of quadratic 
optimization becomes larger for larger condition numbers A/e. Concerning the roots: it 
turns out that in general the roots of the Chebyshev polynomials are much closer to 
the real axis that those of the quadratically optimized ones. For instance, in case of 
the polynomials in figure [7| the absolute values of the imaginary parts are an order of 
magnitude smaller. 

Of course, if really the maximal relative deviation matters then the Chebyshev poly- 
nomials are better by definition. In the two-step multi-bosonic algorithm || || |9| there 
are two places where the quality of the polynomial approximations have an essential in- 
fluence on the performance. First, one wants to achieve a good acceptance rate in the 
noisy correction step for a low order first polynomial and, second, one wants to perform 
the correction on random Gaussian vectors with second and third polynomials of order 
as low as possible. Concerning acceptance, the experience shows || that an acceptance 
about 90% can be achieved on 8 3 • 16 lattices as a whole at large condition numbers above 
10 4 with a quadratically optimized first polynomial of order ri\ < 16. In order to compare 
to Chebyshev polynomials, the lengths of the residue vectors 

ri = \vo-X 2 P n (X 2 )v \ (51) 

were considered on random Gaussian and local starting vectors from a randomly chosen 
site. (In both cases the starting vectors had unit length.) In order to see the dependence 
on the starting vector, this was repeated 100 times for different vq. The results on 8 3 • 16 
and 6 3 • 12 configurations are shown in table ^| and |], respectively. One can see that 
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relative deviation from 1/x: n=16, in [0.0002,3.5] 




Figure 7: The comparison of the relative deviation R(x) = xP(x) — 1 for the 
Chebyshev polynomial (Rc(x)) and the quadratically optimized polynomial (Ro(x)). 
In the example shown the approximation interval is [0.0002, 3.5] and the order n — 16 
is taken in both cases. The lower part of the figure zooms on the two ends of 
the interval. In the left lower corner the maximum deviation of the Chebyshev 
polynomial is smaller: i?c(0.0002) = -0.968 compared to i?o(0.0002) = -0.991. 
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the corresponding values for quadratic optimization are by a factor 15-200 smaller. For 
instance, if one would require the same residues in case of Chebyshev approximation as 
for the lowest order quadratic optimizations shown in the tables, one would need roughly 
80-th order instead of 14-th order in table |3] and 150-th order instead of 16-th order in 
table [|. These conclusions do not depend on the particular configuration. If one picks 
some other gauge configurations from the updating series, the numbers in tables |3] and [4] 
do not change more than 20-30% and the ratios between the left two and right two columns 
even less. The smaller polynomial orders result in non-trivial gains of performance in the 
two-step multi-bosonic algorithm, for instance, because the autocorrelations are roughly 
proportional to the order n\ of the first polynomials. 



Table 3: Comparison of the lengths of residue vectors r\ defined in at different 
orders on a 8 3 • 16 lattice at (/3 = 2.3, if = 0.185). The starting vectors of unit 
length vq are either random Gaussian or local with a single non-zero component 
from a randomly chosen site. The digits in parentheses give standard deviations as 
observed for different starting vectors of the same type. 





Chebyshev 


quadratic 


order 


random 


local 


random 


local 


14 


0.628(1) 


0.441(2) 


0.0417(6) 


0.0242(12) 


40 


0.562(1) 


0.231(1) 


0.0093(2) 


0.00725(9) 


96 


0.0569(2) 


0.0359(1) 


0.00132(2) 


0.00074(3) 



Table 4: The same as table ||| on a 6 3 • 12 lattice at {(3 = 2.3, K = 0.190). 





Chebyshev 


quadratic 


order 


random 


local 


random 


local 


16 


1.145(4) 


0.484(3) 


0.039(2) 


0.026(3) 


60 


0.813(3) 


0.343(2) 


0.0072(11) 


0.0051(9) 


96 


0.483(2) 


0.219(1) 


0.0025(2) 


0.0017(2) 


192 


0.1017(3) 


0.0538(4) 


0.00081(8) 


0.00050(7) 
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6 Summary 



In this paper the quadratically optimized polynomial approximations necessary for multi- 
bosonic algorithms of fermion simulations are considered. 

In section |2| the definitions and basic properties of this scheme are introduced in a 
simple case, namely, optimization of the relative quadratic deviation from the function 
x~ a (a > 0) in a positive interval x G [e, A] (0 < e < A). The expansion in suitably 
defined orthogonal polynomials is also considered. This allows for a recursive evaluation 
of the polynomials, without the knowledge of the roots. 

An algorithm in the algebraic manipulation language Maple V is described in section |3|. 
With its help the coefficients and roots of the optimized polynomials can be determined, 
together with an optimal ordering of the roots for the application of the polynomial of 
the fermion matrix in product form with floating point arithmetics. 

In section f| generalizations are discussed which are necessary in different variants of 
multi-bosonic algorithms: approximations with products of polynomials and extension of 
the region of approximation from the real axis to the complex plane. 

In general, the calculations with the given algorithms can easily be performed on 
modern workstations for polynomial orders as high as n = 100 — 300. These are typically 
the maximal orders one needs in present day numerical simulations of fermionic quantum 
field theories. This is illustrated by figures [l], |] and || where very good approximations 
with small 5 are achieved already below n = 100. Experience in SU(2) Yang- Mills theory 
with gluinos tells || |9| that for the interesting values of A/e ~ O(10 3 ), with rather high 
statistics, practically no deviation of the expectation values can be observed already for 
5 2 ~ (9(10~ 6 ). In these figures the simple polynomials defined in section ^ are considered, 
but very similar results hold also for the other two polynomials needed in the two-step 
multi-bosonic algorithm || ^ , which are discussed in section £|. 

There is no principal obstacle to extend the calculations to higher orders, but then 
the requirements on computer power increase and further improvements of the Maple 
algorithms are welcome. 

Detailed tests for the evaluation of the optimized polynomials are shown in section [5] 
in typical situations relevant in numerical simulations with dynamical fermions. These 
are based on experience gained in the running project with dynamical gluinos in an SU(2) 
gauge theory || [| . A comparison with Chebyshev polynomials in the special case of the 
function x~~ l is favourable for quadratic optimization. 

The main advantage of the quadratic (or least-squares) optimization is its generality 
and flexibility concerning the choice of functions and regions for the approximation. This is 
welcome in present and future large scale numerical simulations with dynamical fermions. 
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